feat(unofficial): mock 950dt + huawei-mindie for unofficial runs by Oseltamivir · Pull Request #257 · SemiAnalysisAI/InferenceX-app

Oseltamivir · 2026-04-27T19:40:06Z

Summary

Unblocks loading the Mock-ascend run 25014782858 via the unofficial-run viewer (?unofficialrun=25014782858).

The mock artifacts use:

hw: 950dt — new GPU, Huawei Ascend 950DT (new vendor)
framework: huawei-mindie — new framework
model: deepseek-ai/DeepSeek-V4-Pro — already supported (dsv4)
precision: fp4, sequences 8K/1K, 1K/1K — already supported

Without these mappings, hwToGpuKey('950dt') returns null and rows are silently skipped by the ETL → the unofficial-run viewer shows nothing.
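
A minimal sketch of why an unregistered `hw` value makes rows disappear. The registry shape, the `b200` entry's values, and the names `HW_REGISTRY_SKETCH`, `hwToGpuKeySketch`, and `keepMappableRows` are assumptions for illustration; the real `HW_REGISTRY` and `hwToGpuKey` may differ.

```typescript
// Hypothetical, simplified registry; the real HW_REGISTRY carries more fields.
interface HwEntry {
  vendor: string;
  arch: string;
  label: string;
  sort: number;
}

const HW_REGISTRY_SKETCH: Record<string, HwEntry> = {
  b200: { vendor: "NVIDIA", arch: "Blackwell", label: "B200", sort: 1 }, // illustrative values
  "950dt": { vendor: "Huawei", arch: "Ascend", label: "Ascend 950DT", sort: 9 },
};

// Returns the registry key for a raw `hw` string, or null when it is not registered.
function hwToGpuKeySketch(hw: string): string | null {
  const key = hw.trim().toLowerCase();
  return Object.prototype.hasOwnProperty.call(HW_REGISTRY_SKETCH, key) ? key : null;
}

interface RawRow {
  hw: string;
  tput_per_gpu: number;
}

// Rows whose hardware cannot be mapped are dropped without an error,
// which is the "silently skipped" behavior described above.
function keepMappableRows(rows: RawRow[]): RawRow[] {
  return rows.filter((row) => hwToGpuKeySketch(row.hw) !== null);
}
```

With the `950dt` entry present, a mock row such as `{ hw: "950dt", tput_per_gpu: 510.2041 }` survives the filter; without it, the same row would be dropped and the viewer would have nothing to plot.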

Changes

packages/constants/src/gpu-keys.ts
- HW_REGISTRY['950dt'] — vendor Huawei, arch Ascend, label Ascend 950DT, sort 9. TDP/power/cost set to placeholder 9.99 — Huawei figures are not yet finalized; bump to real values when available.
- New huawei color zone in VENDOR_OKLCH_ZONES (amber 60–90°) and VENDOR_HSL_ZONES (amber 30–60°). To make room in the HSL wheel, AMD's lower band shrinks from 0–60 to 0–30 and NVIDIA's band shrinks from 60–195 to 90–195. AMD/NVIDIA still get >90° each.
packages/constants/src/framework-aliases.ts
- FW_REGISTRY['huawei-mindie'] → label Huawei MindIE.
packages/app/src/lib/dynamic-colors.ts
- Vendor type now includes 'huawei'; getVendor() recognizes the Huawei vendor string.
packages/app/src/lib/chart-utils.ts
- BANNED_HUE_TEST / PREFERRED_ZONE huawei entries — avoids AMD-red (320–40°) and NVIDIA-green (120–195°) in CIELab; prefers amber/yellow (50–95°).
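
The banned range for AMD-red (320–40°) wraps past 360°, so a plain `start <= h <= end` check would not work. The following is a sketch of a wrap-aware range test using the degree ranges quoted above; `hueInRange`, `isBannedForHuawei`, and `isPreferredForHuawei` are hypothetical names, not the actual `BANNED_HUE_TEST` / `PREFERRED_ZONE` implementation.

```typescript
// True when `hue` (degrees) lies in [start, end]; ranges with start > end wrap past 360.
function hueInRange(hue: number, start: number, end: number): boolean {
  const h = ((hue % 360) + 360) % 360; // normalize into [0, 360)
  return start <= end ? h >= start && h <= end : h >= start || h <= end;
}

// Ranges taken from the description above: avoid AMD-red and NVIDIA-green.
const HUAWEI_BANNED: Array<[number, number]> = [
  [320, 40],
  [120, 195],
];

// Preferred amber/yellow zone from the description above.
const HUAWEI_PREFERRED: [number, number] = [50, 95];

function isBannedForHuawei(hue: number): boolean {
  return HUAWEI_BANNED.some(([start, end]) => hueInRange(hue, start, end));
}

function isPreferredForHuawei(hue: number): boolean {
  return hueInRange(hue, HUAWEI_PREFERRED[0], HUAWEI_PREFERRED[1]);
}
```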

Notes for review

This is added solely so the unofficial-run viewer can render the mock data — it is not intended to ingest mock benchmarks into the production DB. The mock artifacts use tput_per_gpu: 510.2041, median_ttft: 0.0, etc. — fake values.
Cost/power/TDP are placeholders; once real Huawei pricing/specs are known, replace 9.99 and tune the chart sort order.

Test plan

pnpm typecheck
pnpm lint
pnpm test:unit (1896 tests pass across all packages)
On Vercel preview, load ?unofficialrun=25014782858 — confirm 950dt overlay points render under DeepSeek-V4-Pro at 1K/1K, 8K/1K, etc., with conc 16/64/128.

🤖 Generated with Claude Code

Note

Medium Risk
Touches chart data keys, legend reset, URL state, and unofficial-run loading across several providers; behavior is well covered by unit tests but preview validation for the target run is still important.

Overview
Adds Huawei Ascend 950DT (950dt) and Huawei MindIE to shared registries with placeholder power/cost/TDP and new Huawei vendor color zones (OKLch/HSL and CIELab helpers), so Mock-ascend CI artifacts map through ETL instead of being dropped.

The unofficial-run API now accepts bmk_* per-config artifacts when results_bmk is missing (concatenates rows), and sequence handling gains 8k/256 and 8k/625 via islOslToSequence instead of the old 1k/8k-only prefix hack.

Inference charts get multi-sequence overlay (MultiSequenceSelector, i_seq_extra, __seq<compact> synth hwKeys), a “Show as ingested” mode that merges unofficial points into the main scatter pipeline (mergeUnofficialIntoOfficial, i_uoff_ingested), and legend/color/reset fixes so new synth series stay visible. Branch-only defaults auto-pick 8K/256 and output throughput per GPU when an unofficial run loads without URL pins.

Reviewed by Cursor Bugbot for commit c9c31a7.

Adds Huawei Ascend 950DT GPU and Huawei MindIE framework so the Mock-ascend GitHub Actions run (25014782858) can be loaded via the unofficial-run viewer. TDP/power/cost are placeholder 9.99; Huawei specs are not yet finalized.

- HW_REGISTRY: new 950dt entry, vendor=Huawei, arch=Ascend, sort=9
- VENDOR_OKLCH_ZONES / VENDOR_HSL_ZONES: amber/yellow zone for Huawei (steals 30-60° HSL from AMD's lower band; AMD keeps 300-360 + 0-30)
- FW_REGISTRY: huawei-mindie → "Huawei MindIE"
- dynamic-colors Vendor type extended with 'huawei'
- chart-utils BANNED_HUE_TEST/PREFERRED_ZONE entries for huawei (avoid AMD-red and NVIDIA-green clashes; preferred amber/yellow zone)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

vercel · 2026-04-27T19:40:11Z


Project	Deployment	Actions	Updated (UTC)
inferencemax-app	Ready	Preview, Comment	Jun 14, 2026 3:12am

The Mock-ascend workflow uploads one bmk_<config>_conc<N>_<gpu>_<idx> artifact per concurrency instead of a single aggregated results_bmk. The unofficial-run API was filtering strictly on name === 'results_bmk', returning 404 even though the per-config artifacts contained valid rows. When results_bmk is absent, gather every artifact whose name starts with 'bmk_', download each, and concat the rows before normalization. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
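
The fallback described in this commit message can be sketched as follows. This is an illustration under assumptions: `ArtifactInfo`, `selectBenchmarkArtifacts`, and `concatRows` are hypothetical names, and the download step is left out.

```typescript
interface ArtifactInfo {
  name: string;
  id: number;
}

// Prefer the aggregated `results_bmk` artifact; when it is absent,
// fall back to every per-config artifact whose name starts with `bmk_`.
function selectBenchmarkArtifacts(artifacts: ArtifactInfo[]): ArtifactInfo[] {
  const aggregated = artifacts.filter((a) => a.name === "results_bmk");
  if (aggregated.length > 0) return aggregated;
  return artifacts.filter((a) => /^bmk_/.test(a.name));
}

// Concatenate the rows of each downloaded artifact before normalization.
function concatRows<Row>(perArtifactRows: Row[][]): Row[] {
  return perArtifactRows.reduce<Row[]>((all, rows) => all.concat(rows), []);
}
```

Only when neither `results_bmk` nor any `bmk_*` artifact exists does the selection come back empty, which is the case where a 404 is still the right response.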

Oseltamivir · 2026-05-06T20:44:01Z

@claude

Make it able to render 8192 input tokens, 256 output tokens, like from runs/25457884421

github-actions · 2026-05-06T20:45:50Z

Oseltamivir · 2026-05-06T20:55:24Z

@claude

Make it able to render 8192 input tokens, 256 output tokens, like from runs/25457884421

Use this PR/branch, don't create a new branch

github-actions · 2026-05-06T20:56:57Z

Claude finished @Oseltamivir's task in 4m 23s

Done — registered 8K/256 sequence (8192/256) on this PR branch

Pushed commit f1549ed to claude/huawei-950dt-mindie-unofficial. Sweep run 25457884421 ships isl=8192, osl=256; islOslToSequence previously returned null and the rows were silently dropped by the unofficial-run viewer.

Changes

packages/constants/src/models.ts — added '8k/256' ↔ {isl: 8192, osl: 256} in both sequenceToIslOsl and islOslToSequence.
packages/app/src/lib/data-mappings.ts — added Sequence.EightK_256 = '8k/256' and a SEQUENCE_CONFIG entry (label '8K / 256', compact '8k256', category 'default'). Replaced the binary 1k-vs-8k ternary in getModelAndSequenceFromArtifact with islOslToSequence, so future ISL/OSL pairs registered in constants are picked up automatically rather than silently mis-mapped (the old ternary would have mapped osl=256 → 8k, producing a wrong 8k1k sequence key).
Tests — added round-trip + label coverage in packages/constants/src/models.test.ts and packages/app/src/lib/models-mapping.test.ts, plus a new getModelAndSequenceFromArtifact case for dsv4 + 8192/256 in packages/app/src/lib/data-mappings.test.ts.

The flow that now works for ?unofficialrun=25457884421:

Artifact JSON ships isl: 8192, osl: 256 → mapBenchmarkRow passes through.
unofficial-run-provider.tsx:113 calls islOslToSequence(8192, 256) → returns '8k/256' (was null).
GlobalFilterContext availableSequences picks up '8k/256' from unofficialAvailable, the sequence dropdown surfaces it, and useChartData filters rows by {isl: 8192, osl: 256}.
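
A sketch of the two-way mapping this relies on. The table is an illustrative subset, the `1k = 1024` values are an assumption (only `8192/256` is confirmed above), and `SequenceKey`, `SEQUENCE_TO_ISL_OSL`, and `islOslToSequenceSketch` are hypothetical names rather than the real exports of `packages/constants/src/models.ts`.

```typescript
type SequenceKey = "1k/1k" | "8k/1k" | "8k/256";

interface IslOsl {
  isl: number;
  osl: number;
}

// Illustrative subset of the forward map (sequence key to token counts).
const SEQUENCE_TO_ISL_OSL: Record<SequenceKey, IslOsl> = {
  "1k/1k": { isl: 1024, osl: 1024 }, // assumed values
  "8k/1k": { isl: 8192, osl: 1024 }, // assumed osl
  "8k/256": { isl: 8192, osl: 256 },
};

// Reverse lookup derived from the forward map, so the two directions cannot drift apart.
// Returns null for an unregistered pair instead of guessing a nearby sequence.
function islOslToSequenceSketch(isl: number, osl: number): SequenceKey | null {
  const keys = Object.keys(SEQUENCE_TO_ISL_OSL) as SequenceKey[];
  for (const key of keys) {
    const pair = SEQUENCE_TO_ISL_OSL[key];
    if (pair.isl === isl && pair.osl === osl) return key;
  }
  return null;
}
```

Returning `null` for unknown pairs is what distinguishes this from the old ternary, which would have forced `osl=256` into an existing bucket.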

Checks

pnpm typecheck — clean
pnpm --filter @semianalysisai/inferencex-constants test:unit — 22/22 passed
pnpm --filter @semianalysisai/inferencex-app test:unit — 1689/1689 passed
pnpm lint — 0 new errors. The 41 errors reported are pre-existing on master (verified by git stash && pnpm lint); they're all no-underscore-dangle violations in files I didn't touch.

Not done — needs your verification

No browser verification. The dev server in this sandbox can't fetch GitHub artifacts, so I can't load ?unofficialrun=25457884421 and visually confirm the overlay points render. AGENTS.md flags Playwright verification as mandatory for inference-path features — please load the Vercel preview at ?unofficialrun=25457884421 once it deploys and confirm DSv4 overlay points appear under the new 8K / 256 sequence option.
• Branch: claude/huawei-950dt-mindie-unofficial

Registers '8k/256' (isl=8192, osl=256) so unofficial-run overlays from sweeps like InferenceX run 25457884421 render instead of being silently dropped by islOslToSequence returning null.

- packages/constants/src/models.ts: add 8k/256 to both conversion maps.
- packages/app/src/lib/data-mappings.ts: add Sequence.EightK_256 with SEQUENCE_CONFIG entry (label "8K / 256"). Replace the binary 1k-vs-8k ternary in getModelAndSequenceFromArtifact with islOslToSequence so future ISL/OSL pairs registered in constants are picked up automatically rather than mis-mapped.
- Tests added for round-trip + label coverage in models.test.ts / models-mapping.test.ts and a dsv4 + 8192/256 case in data-mappings.test.ts.

Co-authored-by: Bryan Shan <Oseltamivir@users.noreply.github.com>

Oseltamivir · 2026-05-07T04:01:26Z

@claude

For unofficial runs, add a toggle where the dashboard treats them as ingested runs.

They get their actual names in the legend; if a job has multiple, they're treated as separate lines
When this is toggled, the plots should be susceptible to the other toggles (such as "Optimal only"); currently they aren't.
Ensure everything works for multiple unofficial runs
Animation isn't necessary

github-actions · 2026-05-07T04:02:43Z

Claude finished @Oseltamivir's task in 24m 55s

Done — "Show as ingested" toggle for unofficial runs

Pushed 5e327a7 to claude/huawei-950dt-mindie-unofficial. When the toggle is on, unofficial-run rows are promoted into the same series pipeline as DB-ingested data, so they participate in Optimal-only, hardware toggles, precision filters, and per-roofline grouping.

How it works

Each (run, original hwKey) pair is rewritten to a synth key ${origHwKey}__uorun${runId} (e.g. b200_vllm__uorun25457884421). The GPU base prefix is preserved so getModelSortIndex / isKnownGpu keep working — only the suffix is new.
A synthesized HardwareEntry is generated for each synth key with label = "${origLabel} • ${branch}" so the legend shows e.g. B200 • infinitebench-dsv4-b200-20260506-125800 (vLLM).
A per-key color override (from the existing overlayRunColor palette) is threaded through InferenceContext → ScatterGraph's resolveColor so each run still gets its distinct color even after the GPU vendor system would normally re-paint by base GPU.
Multiple runs surface as separate legend entries; one job that benchmarks several GPUs surfaces those GPUs as separate legend entries too — (run, GPU) is the legend granularity.
The X-shape overlay layer is suppressed when merged, and stale .unofficial-overlay-pt / .overlay-roofline-path DOM is cleaned up on the toggle-flip (the D3 layer system has no built-in teardown for dropped layers).
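
The synth-key scheme above can be sketched like this. The functions carry a `Sketch` suffix because their signatures are assumptions; the real `makeSynthHwKey` / `parseSynthHwKey` in `unofficial-merge.ts` may take different arguments.

```typescript
const UORUN_DELIMITER = "__uorun";

// Builds `${origHwKey}__uorun${runId}`, leaving the original key untouched as a prefix.
function makeSynthHwKeySketch(origHwKey: string, runId: string): string {
  return `${origHwKey}${UORUN_DELIMITER}${runId}`;
}

// Peels the trailing run tag; returns null for keys that are not synth keys.
function parseSynthHwKeySketch(hwKey: string): { origHwKey: string; runId: string } | null {
  const idx = hwKey.lastIndexOf(UORUN_DELIMITER);
  if (idx < 0) return null;
  const runId = hwKey.slice(idx + UORUN_DELIMITER.length);
  if (!/^\d+$/.test(runId)) return null;
  return { origHwKey: hwKey.slice(0, idx), runId };
}

// The GPU base is everything before the first underscore, so it survives the suffix.
function gpuBasePrefix(hwKey: string): string {
  const idx = hwKey.indexOf("_");
  return idx < 0 ? hwKey : hwKey.slice(0, idx);
}
```

Because the tag is appended rather than prepended, any helper that only inspects the base prefix sees `b200` for both `b200_vllm` and `b200_vllm__uorun25457884421`.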

What changed

packages/app/src/lib/unofficial-merge.ts (new) — mergeUnofficialIntoOfficial, makeSynthHwKey, parseSynthHwKey, isSynthHwKey, buildStubGraphsForMerge. Reuses processOverlayChartData so the merged points sit in the same x/y coordinate space the official pipeline uses.
packages/app/src/lib/unofficial-merge.test.ts (new) — 10 tests covering synth-key round-trips, no-op behavior when no overlay matches the (model, sequence), single-run merge, multi-run separation (same GPU under two runs gets two synth keys + two colors), preserving official rows, and stub-graph synthesis when official graphs is empty.
packages/app/src/components/unofficial-run-provider.tsx — adds mergeAsIngested + setMergeAsIngested to context. URL-synced via i_uoff_ingested=1 (with a hydration-safe useEffect re-sync).
packages/app/src/components/inference/InferenceContext.tsx — calls the merger after useChartData, exposes merged graphs/hardwareConfig and hwColorOverrides through context. Result memoized, so the toggle adds zero overhead when off.
packages/app/src/components/inference/ui/ScatterGraph.tsx — wraps resolveColor to consult hwColorOverrides first; adds the "Show as ingested" switch (only rendered when isUnofficialRun); cleanup effect removes stale overlay DOM when overlayData drops to null.
packages/app/src/components/inference/ui/ChartDisplay.tsx — sets overlayDataByChartType to null when mergeAsIngested is on, so the X-shape overlay layer doesn't render alongside the merged points.
packages/app/src/lib/url-state.ts — registers i_uoff_ingested as a known URL state key.
packages/app/src/components/inference/types.ts, packages/app/cypress/support/mock-data.ts — type-level wiring for hwColorOverrides and the new toggle fields.

Analytics: latency_unofficial_as_ingested_toggled fires on every flip via the existing track() convention.

Browser verification (Playwright)

Walked through the user-facing flow against localhost:3000:

?unofficialrun=25457884421 (InfiniteBench DSv4 8K/256, B200 vLLM) — toggling on flips the 2 X-shape overlay points to ingested-style dot-groups with synth hwKey = "b200_vllm__uorun25457884421". Optimal-only is on; both points are Pareto-optimal so both render. Toggling off returns to overlay X-shapes (URL param drops too).
?unofficialruns=25014782858 (Mock-ascend, DSv4 8K/1K) — synth hwKey = "950dt_huawei-mindie_mtp__uorun25014782858" co-renders with 374 official DSv4 points; no overlay X-shapes remain.
?unofficialruns=25457884421,25014782858 — both run chips show in the banner; each contributes its own synth-keyed legend entry on the slices where it has data.
Reload with i_uoff_ingested=1 in the URL → toggle starts checked (state is hydration-safe, with a useEffect re-sync to handle the SSR default).

Checks

pnpm typecheck — clean
pnpm --filter @semianalysisai/inferencex-app test:unit — 1699/1699 passed (10 new tests in unofficial-merge.test.ts)
pnpm --filter @semianalysisai/inferencex-constants test:unit — 22/22 passed
pnpm lint — 0 new errors (the 41 reported are pre-existing on master, all no-underscore-dangle in files I didn't touch — verified by git stash && pnpm lint)
Browser-verified with Playwright per the cases above

…ries Adds a "Show as ingested" switch in the inference scatter legend (visible when an unofficial run is loaded) that promotes overlay rows to first-class points so they participate in the same filter pipeline as DB-ingested data: hardware toggles, Optimal-only, precision filtering, and per-(hwKey, run) roofline grouping. Each (run, original hwKey) pair is rewritten to a synthesized hwKey of the form `${origHwKey}__uorun${runId}` — preserving the GPU base prefix so `getModelSortIndex` and `isKnownGpu` keep working — with a synthesized HardwareEntry whose label embeds the run branch and a per-run color override drawn from the existing overlay palette. Multiple runs and per-run multi-GPU configs each surface as their own legend entry. URL-synced via `i_uoff_ingested=1` so the state survives reloads and shares. The X-shape overlay layer is suppressed and stale DOM is cleaned up when the toggle flips on (the layer system has no built-in teardown for layers that drop out of the array). Unit tests cover synth-key encoding, no-op behavior when no overlay matches, and multi-run separation. Co-authored-by: Bryan Shan <Oseltamivir@users.noreply.github.com>

Oseltamivir · 2026-05-07T05:33:54Z

@claude

for the previous edit, the results look good, but continue to build on it:

I don't want the name "infinitebench-dsv4-b200-20260506-125800" to be in the legend
I want the colours to follow ingested runs; the B300 plots are showing up as the same red. They should be different shades of green for NVIDIA

github-actions · 2026-05-07T06:13:29Z

Claude finished @Oseltamivir's task in 12m 4s

Done — bare GPU labels + vendor-zone colors for ingested unofficial rows

Pushed 323d1b4 to claude/huawei-950dt-mindie-unofficial. Both fixes live in packages/app/src/lib/unofficial-merge.ts.

What changed

Legend label drops the branch. makeSynthHardwareEntry previously produced "${baseLabel} • ${branch}"; now it returns just baseLabel (e.g. Ascend 950DT (Huawei MindIE, MTP) instead of ... • mock-ascend-950dt-deepseekv4-20260427-185322). The branch + run URL still surface in gpu, so the row tooltip retains run provenance.
No more per-run color override. The merge no longer populates colorOverrides[synthHwKey] = overlayRunColor(runIdx). Synth keys preserve the original GPU base prefix (b200_vllm__uorun123), so getVendor() recognizes them and generateVendorColors distributes a hue from the vendor's OKLCh zone — exactly the path official rows already use. That's why two NVIDIA GPUs from one unofficial run now get different shades of green rather than two copies of the same --overlay-run-0 red.

The wiring (hwColorOverrides → ScatterGraph.resolveColor) stays in place; the merge just returns an empty {} map. Future callers can still pin a specific color per synth key without further refactors.
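
To illustrate why preserving the base prefix is enough to get vendor-zone colors, here is a sketch that resolves a vendor from the prefix and spreads hues evenly inside that vendor's OKLCH zone. The zone bounds come from the figures quoted in this thread; the even spacing, the prefix table, and all function names are assumptions and not the actual `getVendor` / `generateVendorColors` logic (the observed hues in the swatch dump are not evenly spaced).

```typescript
type VendorName = "nvidia" | "amd" | "huawei";

// Zone bounds as quoted in this thread for VENDOR_OKLCH_ZONES.
const OKLCH_ZONES_SKETCH: Record<VendorName, [number, number]> = {
  nvidia: [120, 170],
  amd: [12, 42],
  huawei: [60, 90],
};

// Hypothetical prefix table standing in for the real vendor lookup.
const PREFIX_TO_VENDOR: Record<string, VendorName> = {
  b200: "nvidia",
  b300: "nvidia",
  mi355x: "amd",
  "950dt": "huawei",
};

function vendorOfHwKey(hwKey: string): VendorName | null {
  const idx = hwKey.indexOf("_");
  const base = idx < 0 ? hwKey : hwKey.slice(0, idx);
  const vendor = Object.prototype.hasOwnProperty.call(PREFIX_TO_VENDOR, base)
    ? PREFIX_TO_VENDOR[base]
    : undefined;
  return vendor ?? null;
}

// Evenly spaced hue for item `index` of `count` inside [start, end] (assumed strategy).
function hueAt(index: number, count: number, start: number, end: number): number {
  if (count <= 1) return (start + end) / 2;
  return start + (index * (end - start)) / (count - 1);
}

// Assigns each key a hue from its vendor's zone; keys with unknown vendors are skipped.
function assignVendorHues(hwKeys: string[]): Record<string, number> {
  const result: Record<string, number> = {};
  const vendors = Object.keys(OKLCH_ZONES_SKETCH) as VendorName[];
  for (const vendor of vendors) {
    const keys = hwKeys.filter((k) => vendorOfHwKey(k) === vendor).sort();
    const [start, end] = OKLCH_ZONES_SKETCH[vendor];
    keys.forEach((key, i) => {
      result[key] = hueAt(i, keys.length, start, end);
    });
  }
  return result;
}
```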

Browser verification (Playwright)

Loaded ?unofficialruns=25457884421,25014782858&i_uoff_ingested=1 against localhost:3000, switched to DSv4 / 8K / 1K, hit "Reset filter" to expand the legend, and dumped the swatch styles:

Ascend 950DT (Huawei MindIE, MTP) → oklch(0.64 0.18 75) — Huawei amber zone (60–90)
GB300 NVL72 (Dynamo SGLang) → oklch(0.78 0.15 121.7) — NVIDIA green
B300 (vLLM) → oklch(0.62 0.15 148.3) — NVIDIA green, not red
B200 (vLLM) → oklch(0.52 0.15 165) — NVIDIA green, distinct shade
MI355X (SGLang) → oklch(0.5 0.22 34.5) — AMD red
MI355X (ATOM¹) → oklch(0.78 0.22 19.5) — AMD red

Each NVIDIA GPU sits in the 121.7–165 hue band, a different shade per GPU, exactly as requested. No legend label contains "infinitebench-..." or "mock-ascend-..." anywhere. Switched to 8K / 256 to confirm the InfiniteBench overlay (B200 (vLLM)) also receives a green hue (oklch(0.64 0.15 145)) instead of the previous overlay-run red.

Tests

Added 3 integration cases to packages/app/src/lib/unofficial-merge.test.ts:

two NVIDIA synth keys (B200 + B300) from a single run get distinct colors;
vendor-zone hues stay inside their bands (NVIDIA 120–170, AMD 12–42 per VENDOR_OKLCH_ZONES);
two synth keys for the same GPU but different runs still get distinct hues.

Updated the existing label/color assertions to expect bare GPU labels and an empty colorOverrides map. The gpu field is still asserted to contain UNOFFICIAL: <branch> so we don't regress the tooltip provenance.

Checks

pnpm typecheck — clean
pnpm exec vitest run src/lib/unofficial-merge.test.ts — 13/13 passed (10 original + 3 new)
pnpm exec vitest run (whole app) — 1702/1702 passed
pnpm exec oxlint packages/app/src/lib/unofficial-merge.ts unofficial-merge.test.ts — 0 errors. The 41 pnpm lint repo-wide errors are pre-existing on master (no-underscore-dangle in files I didn't touch).
Browser-verified with Playwright per the swatch dump above

Drops the branch suffix from synth-key labels and the per-run color override so unofficial rows promoted via "Show as ingested" share the same vendor-aware palette as official rows. Two NVIDIA GPUs from one unofficial run now get distinct shades of green instead of one shared overlay-palette red. The branch + run URL still surface in `gpu` so the row tooltip retains provenance. Co-authored-by: Bryan Shan <Oseltamivir@users.noreply.github.com>

Oseltamivir · 2026-05-07T21:00:00Z

@claude

Just for this branch, temporarily set the default view to "8k / 256" and "Output Token Throughput per GPU". This default should apply after an unofficial run automatically changes the model (introduced in #243)

github-actions · 2026-05-07T21:01:23Z

Claude finished @Oseltamivir's task in 8m 40s

Done — temp default to `8K / 256` + `Output Token Throughput per GPU` on unofficial-run load

Pushed 462dc0f to claude/huawei-950dt-mindie-unofficial. When ?unofficialrun= loads and the URL doesn't pin i_seq / i_metric, the dashboard now defaults the sequence to 8K / 256 and the y-axis metric to Output Token Throughput per GPU, alongside the existing model auto-switch from #243.

What changed

packages/app/src/lib/unofficial-run-auto-switch.ts — added a sibling pure helper computeUnofficialOverrideDecision(unofficialAvailable, urlValue, lastKey) that mirrors computeAutoSwitchDecision's dedupe pattern: empty overlay → reset key, URL-pinned param → no-op (preserve the same key so a later URL clear can re-fire), same model-set key → no-op, otherwise → fire override and advance key. Same sorted-models key shape as the auto-switch so a sequence-only delta doesn't invalidate a manual user pick.
packages/app/src/components/GlobalFilterContext.tsx — added a useEffect that calls the helper with getUrlParam('i_seq') and a lastUnofficialSeqOverrideRef, then setSelectedSequence(Sequence.EightK_256) when shouldOverride is true. Lives next to the existing lastAutoSwitchKeyRef effect so the two are reviewed together.
packages/app/src/components/inference/InferenceContext.tsx — analogous effect using getUrlParam('i_metric') + lastUnofficialMetricOverrideRef, calling setSelectedYAxisMetric('y_outputTputPerGpu'). Pulled availableModelsAndSequences out of useUnofficialRun() (already in the same destructure block) so the effect can key off the same overlay set the auto-switch sees.
packages/app/src/lib/unofficial-run-auto-switch.test.ts — added 6 unit tests for the new helper: empty overlay resets, fresh fire with no URL pin, URL pin respected, no re-fire after applied, re-arm after overlay clears, sequence-only delta ignored, deterministic key across insertion orders.

Both effects are clearly tagged TEMPORARY (this branch only) so the override is easy to revert.
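
The four-branch dedupe described for `computeUnofficialOverrideDecision` can be sketched as a pure function. This is a simplification under assumptions: the overlay is reduced to a list of model names, the key is the sorted names joined with `|`, and the return shape (`shouldOverride`, `nextKey`) is hypothetical apart from `shouldOverride`, which is named above.

```typescript
interface OverrideDecision {
  shouldOverride: boolean;
  nextKey: string | null;
}

function computeOverrideDecisionSketch(
  availableModels: string[],
  urlValue: string | null,
  lastKey: string | null,
): OverrideDecision {
  // Empty overlay: reset the key so a later run can fire again.
  if (availableModels.length === 0) return { shouldOverride: false, nextKey: null };
  // URL-pinned param: no-op, and keep the key unchanged so a later URL clear can re-fire.
  if (urlValue !== null && urlValue !== "") return { shouldOverride: false, nextKey: lastKey };
  // Key depends only on the sorted model set, so insertion order and sequence-only changes are ignored.
  const key = availableModels.slice().sort().join("|");
  // Same model set as last time: already applied, do nothing.
  if (key === lastKey) return { shouldOverride: false, nextKey: lastKey };
  // New model set: fire the override and advance the key.
  return { shouldOverride: true, nextKey: key };
}
```

A caller would store `nextKey` in a ref after every call and apply the default only when `shouldOverride` is true.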

Browser verification (Playwright on `localhost:3000`)

?unofficialrun=25457884421 (no pins) — model auto-switched to DeepSeek V4 Pro, sequence dropdown landed on 8K / 256, y-axis on Output Token Throughput per GPU, NON-OFFICIAL banner present.
?unofficialrun=25457884421&i_seq=1k%2F1k&i_metric=y_tpPerGpu — sequence stayed at 1K / 1K, y-axis stayed at Token Throughput per GPU (URL pins respected). Model still auto-switched because g_model wasn't pinned.

Checks

pnpm typecheck — clean
pnpm --filter @semianalysisai/inferencex-app exec vitest run src/lib/unofficial-run-auto-switch.test.ts — 15/15 passed
pnpm --filter @semianalysisai/inferencex-app test:unit — 1773/1773 passed
pnpm lint — 0 warnings, 0 errors
Browser-verified per the cases above

Notes

The 2 console errors observed during the URL-pinned navigation case are pre-existing (a script-tag warning and a SSR/CSR hydration mismatch on SequenceSelector from the existing useState initializer reading the URL on the client — happens for any i_seq= URL pin, regardless of this change). Not introduced by this commit.
The override is gated by both the dedupe ref and the URL-pin check, so manual sequence/metric changes (which write back to the in-memory URL store and then to the address bar) are safe — the next render reads the pinned URL value via getUrlParam and bails. Switching to a different unofficial run with a different model set re-arms the override (same behavior as the model auto-switch).

… on unofficial-run load Temporary branch-only override: when a `?unofficialrun=` URL loads and the user did not pin `i_seq` / `i_metric` themselves, default the sequence to `8K / 256` and the y-axis metric to `Output Token Throughput per GPU` so InfiniteBench-style sweeps land on a useful default view alongside the model auto-switch from #243. Mirrors the dedupe shape of `computeAutoSwitchDecision` via a new pure helper `computeUnofficialOverrideDecision` so manual user picks stick once URL-synced and a fresh run-set transition can re-arm the override. Co-authored-by: Bryan Shan <Oseltamivir@users.noreply.github.com>

Oseltamivir · 2026-06-13T23:47:00Z

Validated the offline B300 TRT path end to end.

InferenceX run 27482213487 now uploads canonical results_bmk/agg_bmk.json.
/api/unofficial-run?runId=27482213487 returns one B300 / DSV4 / TRT / FP4 / MTP row at 8192/625, concurrency 32.
buildChartData creates one E2E and one interactivity point under DeepSeek-V4-Pro_8k/625.
Commit 7b4dc87 keeps the branch's 8K/256 preference when a run provides that shape, otherwise it selects the run's actual sequence so the bare unofficial-run URL does not hide this 8K/625 point.
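
A sketch of the selection rule from commit 7b4dc87 as described here: keep the preferred shape when the run provides it, otherwise pick one the run actually has. `selectDefaultSequenceSketch` is a hypothetical stand-in for `selectUnofficialDefaultSequence`, whose real signature takes more arguments, and taking the first entry after a plain string sort is an assumption about the fallback.

```typescript
// `availableForModel` is the list of sequence keys the loaded run provides for the target model.
function selectDefaultSequenceSketch(
  availableForModel: string[],
  preferred: string = "8k/256",
): string | null {
  if (availableForModel.length === 0) return null;
  // Keep the branch's preference only when the run really has that shape.
  if (availableForModel.indexOf(preferred) >= 0) return preferred;
  // Otherwise fall back deterministically to one of the run's own sequences.
  const sorted = availableForModel.slice().sort();
  return sorted[0] ?? null;
}
```

For run 27482213487, which only ships 8192/625, this returns `"8k/625"` instead of a sequence with no points.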

Current preview: https://inferencemax-app-git-claude-huawei-950dt-0f2683-semianalysisai.vercel.app/inference?unofficialrun=27482213487

Validation: full unit suite 2,016 tests passed; typecheck, lint, and format checks passed.

Oseltamivir · 2026-06-14T00:10:54Z

@claude

Allow for selection of multiple sequence length options on one graph. (e.g. 1k1k with 8k1k)

github-actions · 2026-06-14T00:12:37Z

Claude finished @Oseltamivir's task in 25m 15s

Done — multi-select ISL/OSL on the inference chart

Pushed 747f1ad to claude/huawei-950dt-mindie-unofficial. The sequence picker is now a multi-select — pick 1K/1K and 8K/1K together and both overlay on the same chart, each (GPU, sequence) as its own legend line.

How it works

New state lives in InferenceContext: extraSequences: Sequence[] with URL param i_seq_extra. Primary selectedSequence semantics are unchanged — the calculator / evaluation / trends tabs see exactly the same singular value they did before. Only the inference tab opts in.
New helper module packages/app/src/lib/sequence-synth-key.ts: builds ${origHwKey}__seq<compact> synth hwKeys (e.g. b200_vllm__seq1k1k). Preserves split('_')[0] so getModelSortIndex / isKnownGpu / getVendor keep working — that's why two NVIDIA GPUs across two sequences still land in the NVIDIA hue band but with distinct shades, instead of needing a separate palette.
useChartData filters rows by IN-list, groups by sequence, runs transformBenchmarkRows per group, and rewrites each output point's hwKey with the synth form. Synth HardwareConfig entries get the sequence appended to their label ("B200 — 1K / 1K"). filterByGPU is widened to strip the __seq suffix so the GPU selector still matches across sequences.
mergeUnofficialIntoOfficial now iterates over (primary + extras). When multi-sequence is on, the synth key shape composes as base__seq<compact>__uorun<id> — the seq tag precedes the run tag so parseSynthHwKey keeps round-tripping.
New MultiSequenceSelector in chart-selectors.tsx (max 3 picks). The singular SequenceSelector is left intact for the calculator's needs.
Legend reset effect now keys off extraSequences and mergeAsIngested so newly-introduced synth hwKeys become active when the user flips either toggle — fixes the Cursor "Merged series stay legend-inactive" finding on commit 7b4dc87. (The Cursor "Blind 8k/256 hides unofficial rows" finding is already addressed by selectUnofficialDefaultSequence in the same prior commit.)
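
The key composition described above can be sketched as follows. The helper names are hypothetical (the real module is `sequence-synth-key.ts`), and the regular expression assumes compact forms contain only letters and digits, as in `1k1k` or `8k256`.

```typescript
const SEQ_TAG = "__seq";
const RUN_TAG = "__uorun";

// Inner suffix: tags a hardware key with a compact sequence form, e.g. `b200_vllm__seq1k1k`.
function makeSeqSynthKeySketch(origHwKey: string, compact: string): string {
  return `${origHwKey}${SEQ_TAG}${compact}`;
}

// Outer suffix: tags an (already sequence-tagged) key with a run id.
function appendRunTag(hwKey: string, runId: string): string {
  return `${hwKey}${RUN_TAG}${runId}`;
}

// Removes the sequence tag so a GPU filter can compare canonical keys across sequences.
// Anything composed after the tag (such as a run tag) is preserved.
function stripSeqSuffix(hwKey: string): string {
  return hwKey.replace(/__seq[a-z0-9]+/i, "");
}
```

Since the run tag is always outermost, code that peels only a trailing `__uorun<id>` still works on composed keys without knowing about sequences.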

Tests

sequence-synth-key.test.ts (new, 12 cases) — compact form, round-trip, base prefix preservation, label tagging, __uorun composition, edge cases (unknown compact, missing suffix).
unofficial-merge.test.ts (2 new cases) — multi-sequence merge produces base__seq<compact>__uorun<id>; duplicate primary in extras dedups to single-sequence behavior.
useChartData.test.ts (1 new case) — filterByGPU strips __seq suffix so canonical GPU picks match across sequences.
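
The dedup case mentioned for `unofficial-merge.test.ts` can be illustrated with a small normalizer. This is a sketch under assumptions: `normalizeSequencePicks` is a hypothetical helper, and treating "max 3 picks" as a cap on the total (primary plus extras) is a guess about how the selector counts.

```typescript
// Combines the primary sequence with the extras, dropping duplicates and capping the total.
function normalizeSequencePicks(primary: string, extras: string[], maxPicks: number = 3): string[] {
  const picks: string[] = [primary];
  for (const seq of extras) {
    if (picks.length >= maxPicks) break;
    if (picks.indexOf(seq) < 0) picks.push(seq);
  }
  return picks;
}
```

When the only extra equals the primary, the result has a single entry, which matches the "dedups to single-sequence behavior" case.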

Browser verification (Playwright on `localhost:3000`)

/inference → opened the ISL/OSL picker, selected 8K/1K + 1K/1K → both chips show, legend renders e.g. "GB300 NVL72 — 1K / 1K (Dynamo SGLang)" and "GB300 NVL72 — 8K / 1K (Dynamo SGLang)" side by side; chart shows two roofline groups per GPU.
URL state round-trips: ?i_seq=8k%2F1k&i_seq_extra=1k%2F1k reloads with both selections active.
Unofficial-run overlay composition: ?unofficialrun=27482213487&i_seq_extra=8k%2F1k&i_uoff_ingested=1 — B300 — 8K / 625 (from the run) and B300 — 8K / 1K (official) both appear in the legend simultaneously. The sequence-tag is the inner suffix and the run-tag is the outer suffix, so the existing unofficial-merge code composes cleanly.

Checks

pnpm typecheck — clean
pnpm exec vitest run — 1798/1798 passed, including new tests
pnpm exec oxlint on touched files — 0 new errors. The 6 reported are pre-existing on master (no-negated-condition / require-unicode-regexp in files I didn't add); verified via git stash && pnpm lint.
Browser-verified per the cases above

Notes for review

Default is a single sequence — nothing changes for users until they explicitly pick a second one. Backwards-compatible.
The seq-synth key shape (base__seq<compact>) shares a delimiter convention with the unofficial-merge synth shape (base__uorun<id>) so they compose; parseSynthHwKey was already structured to peel only the trailing __uorun…, which is what we relied on.
Cursor's "Popstate skips ingested toggle sync" finding is out of scope for this turn (separate from multi-sequence) and untouched.
Out of scope: I did NOT touch the calculator / evaluation / historical-trends tabs. They keep using singular selectedSequence.

… one chart

The sequence picker becomes a multi-select. Picking >1 (e.g. 1K/1K + 8K/1K) overlays both shapes on the same axes: each (GPU, sequence) becomes its own legend line via a `__seq<compact>` hwKey suffix, so vendor-zone colors still distinguish hardware while the legend caption reads e.g. "B200 — 1K / 1K" vs "B200 — 8K / 1K".

State: new `extraSequences: Sequence[]` lives in InferenceContext (URL param `i_seq_extra`, empty default). Primary `selectedSequence` semantics are unchanged for the calculator / evaluation / trends tabs.

Pipeline: `useChartData` filters rows by `IN [primary, ...extras]`, groups by sequence, runs `transformBenchmarkRows` per sequence, and rewrites each output point's hwKey with `makeSeqSynthKey`. Synth hardware entries get the sequence appended to their label. `filterByGPU` is widened to strip the `__seq` suffix so the GPU selector still matches across sequences.

Unofficial-run merge: when `mergeAsIngested` is on with multiple sequences, the merger iterates over (primary + extras), composing `base__seq<compact>__uorun<id>` synth keys so a (run, GPU, sequence) triple gets its own legend line. `parseSynthHwKey` keeps working because the seq delimiter precedes the run delimiter.

Legend reset effect now also keys off `extraSequences` and `mergeAsIngested` so newly-introduced synth hwKeys become active when the user flips either toggle. This fixes the Cursor "Merged series stay legend-inactive" finding on the prior commit where `reconcileActiveSet` only removed stale keys.

Tests:
- `sequence-synth-key.test.ts` (new): round-trip, base prefix preservation, label tagging, compose with `__uorun`.
- `unofficial-merge.test.ts`: multi-sequence merge produces `__seq__uorun` shape; primary-only with duplicate extra dedup is a no-op.
- `useChartData.test.ts`: `filterByGPU` strips `__seq` suffix.

Verified in Playwright at localhost:3000/inference: legend renders e.g. "GB300 NVL72 — 1K / 1K (Dynamo SGLang)" alongside the 8K / 1K variant, URL state restores from `?i_seq_extra=`, and overlay composition works with `?unofficialrun=27482213487&i_seq_extra=8k%2F1k&i_uoff_ingested=1` (B300 — 8K / 625 from the run + B300 — 8K / 1K official both render).

Co-authored-by: Bryan Shan <Oseltamivir@users.noreply.github.com>

Oseltamivir · 2026-06-14T02:55:20Z

@claude merge main in, and resolve or address the PR reviews

github-actions · 2026-06-14T02:56:40Z

Claude finished @Oseltamivir's task in 15m 41s

Done — merged `master` + addressed all four Cursor findings

Pushed two commits to claude/huawei-950dt-mindie-unofficial:

703c6b9: merge origin/master (91 commits behind, 9 file conflicts resolved).
c9c31a7: Cursor fixes #4 and #5 + a small lint cleanup.

Cursor findings — resolution status

| # | Finding | Severity | Status |
| --- | --- | --- | --- |
| 1 | Blind 8k/256 hides unofficial rows (`GlobalFilterContext.tsx`) | Medium | ✅ Already addressed in `7b4dc87`: `selectUnofficialDefaultSequence(unofficialAvailable, selectedModel, urlModel)` only returns `EightK_256` when it's in the run's available sequences for the target model, otherwise falls back to a sorted available sequence. Verified the merged code still has the same logic. |
| 2 | Merged series stay legend-inactive (`InferenceContext.tsx:316`) | High | ✅ Already addressed in `747f1ad`: the hw-reset key now includes `${extraSequencesKey}` … |
| 3 | Popstate skips ingested toggle sync (`unofficial-run-provider.tsx`) | Low | ✅ Fixed in this turn: `load` now takes an `isPopstate` flag and re-reads `params.get('i_uoff_ingested')` on browser back/forward. |
| 4 | Merge toggle cleared after URL hydration (`unofficial-run-provider.tsx`) | Medium | ✅ Fixed in this turn: both the `useState` initializer and the hydration `useEffect` now use `readUrlParams()` (the in-memory snapshot captured at module load, before `url-state.ts`'s deferred cleanup), so share links with `?i_uoff_ingested=1` no longer race against URL cleanup. |
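The fallback rule described for finding 1 could look roughly like the sketch below. Only the function name and its three parameter names come from the table. The `AvailableByModel` shape, the string value of the 8K/256 constant, the `null` return for an empty run, and the choice to let `urlModel` take precedence over `selectedModel` are all assumptions made for illustration.

```typescript
type Sequence = string;

// Assumed string form of the 8K/256 sequence; the real enum value may differ.
const EIGHT_K_256: Sequence = '8k/256';

// Hypothetical shape: model id -> sequences present in the unofficial run.
type AvailableByModel = Record<string, Sequence[]>;

function selectUnofficialDefaultSequence(
  unofficialAvailable: AvailableByModel,
  selectedModel: string,
  urlModel: string | null,
): Sequence | null {
  // Assumption: a model given in the URL wins over the currently selected one.
  const targetModel = urlModel ?? selectedModel;
  const available = unofficialAvailable[targetModel] ?? [];
  if (available.length === 0) return null;

  // Prefer 8K/256 only when the run actually contains it for this model.
  if (available.includes(EIGHT_K_256)) return EIGHT_K_256;

  // Otherwise fall back to the first sequence in sorted order.
  return [...available].sort()[0] ?? null;
}
```

The point of the guard is that the default never lands on a sequence the run has no rows for, which is what left the viewer empty in the original finding.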

Why a single fix can't cover both popstate (#3) and hydration (#4)

On initial mount, the URL may have already been stripped by url-state.ts's setTimeout(0) cleanup, so the live URL is unreliable — we must read the in-memory snapshot.
On popstate, the browser has restored the previous history entry, so the live URL is the source of truth — and the snapshot is stale.
The load(isPopstate) flag picks the right source per call.
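As a minimal sketch of that source selection, assuming the snapshot returned by `readUrlParams()` behaves like a `URLSearchParams` object: the helper name `readIngestedFlag` is hypothetical, and the live query string is passed in as an argument (rather than read from `window.location.search`) so the logic stays a pure function.

```typescript
// Decide where to read `i_uoff_ingested` from.
//  - initial mount: the live URL may already have been stripped by the
//    deferred cleanup, so trust the in-memory snapshot;
//  - popstate: the browser has restored a history entry, so trust the live URL.
function readIngestedFlag(
  isPopstate: boolean,
  liveSearch: string,
  snapshot: URLSearchParams,
): boolean {
  const params = isPopstate ? new URLSearchParams(liveSearch) : snapshot;
  return params.get('i_uoff_ingested') === '1';
}

// Share link opened with the flag; cleanup has since emptied the live URL.
const snapshot = new URLSearchParams('unofficialrun=1&i_uoff_ingested=1');
const onMount = readIngestedFlag(false, '', snapshot);

// User navigates back to an entry that does not carry the flag.
const onBack = readIngestedFlag(true, '?unofficialrun=1', snapshot);
```

Here `onMount` is `true` even though the live URL is empty, and `onBack` is `false` even though the stale snapshot still carries the flag.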

Why I didn't add a unit test for the popstate path

The existing unofficial-run-provider.test.ts only exercises the pure helpers (buildChartData, parseAvailableModelsAndSequences). The provider itself isn't rendered there, and adding a render harness for one regression case felt out of scope. The behavior is exercised by browser verification + the logic is small.

Checks

pnpm typecheck — clean
pnpm lint — 0 errors (also cleaned up 2 pre-existing nits in unofficial-merge.test.ts that I touched)
pnpm --filter @semianalysisai/inferencex-app exec vitest run — 2090/2090 passed
pnpm --filter @semianalysisai/inferencex-constants exec vitest run — 30/30 passed
Browser-verified with Playwright that ?unofficialrun=…&i_uoff_ingested=1 hydrates with aria-checked="true" on the scatter-uoff-as-ingested switch, even after url-state.ts runs its deferred URL cleanup. (Did not browser-test popstate end-to-end — sandboxed Playwright + dev mode makes history mutation testing brittle; the popstate code path is small and the logic is straightforward.)

Merge-conflict resolutions (commit `703c6b9`)

9 files had textual conflicts — all were "both-sides-added" merges, no semantics had to be reconciled:

packages/constants/src/framework-aliases.ts — huawei-mindie + mooncake-atom both registered
packages/app/cypress/support/mock-data.ts — hwColorOverrides / extraSequences + compareGpuPair both in mock context
packages/app/src/app/api/unofficial-run/route.test.ts — both new test cases preserved
packages/app/src/lib/data-mappings.ts — both new imports preserved
packages/app/src/components/inference/types.ts — both new context fields preserved
packages/app/src/components/inference/hooks/useChartData.ts — getHardwareConfig + hardwareKeyMatchesAnyBase both imported; extraSequences + compareGpuPair both parameters
packages/app/src/components/inference/InferenceContext.tsx — switched hasMtpEngineExclusion → getModelExclusion (master refactored exclusion API); kept Sequence import for extraSequences; threaded both effectiveExtraSequences and compareGpuPair into useChartData and context
packages/app/src/components/inference/ui/ChartDisplay.tsx — both mergeAsIngested and activeOverlayHwTypes destructured
packages/app/src/components/inference/ui/ScatterGraph.tsx — both the "Show as ingested" toggle and master's reshaped log-scale toggle preserved
· branch: claude/huawei-950dt-mindie-unofficial

…-mindie-unofficial

# Conflicts:
#	packages/app/cypress/support/mock-data.ts
#	packages/app/src/app/api/unofficial-run/route.test.ts
#	packages/app/src/components/inference/InferenceContext.tsx
#	packages/app/src/components/inference/hooks/useChartData.ts
#	packages/app/src/components/inference/types.ts
#	packages/app/src/components/inference/ui/ChartDisplay.tsx
#	packages/app/src/components/inference/ui/ScatterGraph.tsx
#	packages/app/src/lib/data-mappings.ts
#	packages/constants/src/framework-aliases.ts

Two related Cursor findings on the `i_uoff_ingested` URL param:

1. **Merge toggle cleared after URL hydration**: the post-mount sync was reading `window.location.search`, but `url-state.ts` strips share-link params on a deferred `setTimeout(0)`. If that cleanup ran before our effect, a shared link with `i_uoff_ingested=1` would silently turn the toggle off. Both the `useState` initializer and the hydration effect now read `readUrlParams()` (in-memory snapshot captured before cleanup).
2. **Popstate skips ingested toggle sync**: the `popstate` listener re-loaded run data but never re-read `i_uoff_ingested`. Browser back/forward could leave the toggle out of sync with the URL. The load function now takes an `isPopstate` flag and re-syncs the toggle from `window.location.search` only on popstate (not on the initial call, which would race with the cleanup above).

Also: drop a stale lint nit in `unofficial-merge.test.ts` by switching the hue-extraction regex to a named-capture-group + unicode flag.

Co-authored-by: Bryan Shan <Oseltamivir@users.noreply.github.com>

blacksmith-sh · 2026-06-14T03:13:48Z

Found 8 test failures on Blacksmith runners:

Failures

Test	View Logs
`First-load navigation/navigates to comparisons from the header with one click`	View Logs
`Inference ChartControls/renders the sequence selector with the current sequence`	View Logs
`Model Architecture Diagram › Alternating Attention Blocks (MoE model - gpt-oss 120B)/` `"before all" hook for "shows MoE and Sink/Full GQA badges for gpt-oss"`	View Logs
`Model Architecture Diagram › Alternating Attention Blocks (MoE model - gpt-oss 120B)/` `"before all" hook for "shows MoE and Sink/Full GQA badges for gpt-oss"`	View Logs
`Model Architecture Diagram › Collapsible Transformer Blocks (MoE model - MiniMax M3)/` `"before all" hook for "shows MoE and GQA badges for MiniMax M3"`	View Logs
`Model Architecture Diagram › Collapsible Transformer Blocks (MoE model - MiniMax M3)/` `"before all" hook for "shows MoE and GQA badges for MiniMax M3"`	View Logs
`Model Architecture Diagram › Hybrid Attention Blocks (MoE model - DeepSeek V4 Pro)/` `"before all" hook for "shows MoE and Hybrid badges for DeepSeek V4 Pro"`	View Logs
`Model Architecture Diagram › Hybrid Attention Blocks (MoE model - DeepSeek V4 Pro)/` `"before all" hook for "shows MoE and Hybrid badges for DeepSeek V4 Pro"`	View Logs


cursor

Cursor Bugbot has reviewed your changes and found 2 potential issues.


Reviewed by Cursor Bugbot for commit c9c31a7.

cursor · 2026-06-14T03:13:56Z

+  // official series via InferenceContext, so suppress the X-shape overlay layer.
  const overlayDataByChartType = useMemo(() => {
-    if (!unofficialRunInfo || !getOverlayData) {
+    if (mergeAsIngested || !unofficialRunInfo || !getOverlayData) {


Overlay ignores extra sequences

Medium Severity

With unofficial overlay mode (not “Show as ingested”), overlay points are loaded only for the primary selectedSequence. Extra ISL/OSL shapes from extraSequences appear in official useChartData series but never in the X-shape overlay, so unofficial points for those sequences are missing on the same chart.


cursor · 2026-06-14T03:13:56Z

+    const url = new URL(window.location.href);
+    if (v) url.searchParams.set('i_uoff_ingested', '1');
+    else url.searchParams.delete('i_uoff_ingested');
+    window.history.replaceState({}, '', url);


Share URL drops ingested toggle

Medium Severity

Toggling “Show as ingested” updates the address bar via history.replaceState but does not call writeUrlParams, so i_uoff_ingested never enters currentState. buildShareUrl only serializes currentState (plus unofficial run ids from the live URL), so generated share links omit the merge-as-ingested setting even when the toggle is on.

Additional Locations (1)

packages/app/src/lib/url-state.ts#L200-L227
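To make the finding concrete, here is a toy model of the two paths. It is not the contents of `url-state.ts`: the names `currentState`, `writeUrlParams` and `buildShareUrl` come from the finding, but their bodies below are simplified assumptions, and `setMergeAsIngested` is a hypothetical toggle handler.

```typescript
// Assumed stand-in for the serializable state that share links are built from.
const currentState: Record<string, string> = {};

// Simplified writer: a null value removes the key.
function writeUrlParams(updates: Record<string, string | null>): void {
  for (const key of Object.keys(updates)) {
    const value = updates[key];
    if (value === null || value === undefined) delete currentState[key];
    else currentState[key] = value;
  }
}

// Simplified share-link builder: serializes only what is in currentState.
function buildShareUrl(base: string): string {
  const url = new URL(base);
  for (const key of Object.keys(currentState)) {
    url.searchParams.set(key, currentState[key] ?? '');
  }
  return url.toString();
}

// Hypothetical toggle handler that records the flag in shared state, so the
// share link sees it. A history.replaceState call alone would only change
// the address bar and leave currentState untouched.
function setMergeAsIngested(on: boolean): void {
  writeUrlParams({ i_uoff_ingested: on ? '1' : null });
}

setMergeAsIngested(true);
const shareOn = buildShareUrl('https://example.com/inference?unofficialrun=1');
setMergeAsIngested(false);
const shareOff = buildShareUrl('https://example.com/inference?unofficialrun=1');
```

In this model `shareOn` carries `i_uoff_ingested=1` and `shareOff` does not. Whether the real fix should route through `writeUrlParams` or teach `buildShareUrl` to read the live URL is a design choice the finding leaves open.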


vercel Bot deployed to Preview April 27, 2026 19:40 View deployment

Oseltamivir closed this Apr 27, 2026

Oseltamivir reopened this Apr 27, 2026

Oseltamivir closed this Apr 27, 2026

Oseltamivir reopened this May 6, 2026

vercel Bot deployed to Preview May 6, 2026 21:01 View deployment

vercel Bot deployed to Preview May 7, 2026 04:27 View deployment

vercel Bot deployed to Preview May 7, 2026 06:25 View deployment

Merge branch 'master' into claude/huawei-950dt-mindie-unofficial

ef3f21d

vercel Bot deployed to Preview May 7, 2026 07:11 View deployment

vercel Bot deployed to Preview May 7, 2026 21:10 View deployment

Oseltamivir changed the title ~~feat(unofficial): support 950dt + huawei-mindie for unofficial runs~~ feat(unofficial): mock 950dt + huawei-mindie for unofficial runs May 26, 2026

feat(unofficial): support 8k625 benchmark rows

1127bd2

Oseltamivir requested a review from adibarra as a code owner June 13, 2026 23:21

vercel Bot deployed to Preview June 13, 2026 23:22 View deployment

cursor Bot reviewed Jun 13, 2026


Comment thread packages/app/src/components/GlobalFilterContext.tsx Outdated

Comment thread packages/app/src/components/unofficial-run-provider.tsx

fix(unofficial): show available run sequence

7b4dc87

vercel Bot deployed to Preview June 13, 2026 23:45 View deployment

cursor Bot reviewed Jun 13, 2026


Comment thread packages/app/src/components/inference/InferenceContext.tsx

vercel Bot deployed to Preview June 14, 2026 00:38 View deployment

cursor Bot reviewed Jun 14, 2026


Comment thread packages/app/src/components/unofficial-run-provider.tsx

github-actions Bot and others added 2 commits June 14, 2026 03:03

vercel Bot deployed to Preview June 14, 2026 03:12 View deployment

cursor Bot reviewed Jun 14, 2026

